Accelerating External Sorting via On-the-fly Data Merge in Active SSDs
Abstract
The concept of active SSDs (solid-state drives) has been introduced to cope with the demands of processing ever-increasing volumes of data. In active SSDs, some data-processing tasks are offloaded to the SSDs, freeing host system resources and improving the overall performance of data analysis. In this paper, we propose a novel active SSD architecture focused on improving the external sorting algorithm, which is used extensively in data-intensive computing. By performing merge operations on the fly in active SSDs, our method can remove the extra data transfer and extend the lifetime of SSDs. Our evaluation results on a real SSD platform indicate that the proposed scheme outperforms traditional external sorting by up to 39%.
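For readers unfamiliar with the baseline, the following is a minimal sketch of conventional host-side external merge sort: sorted runs are written to storage and then combined by a k-way merge. It is not the authors' in-SSD scheme. The function names (`write_run`, `read_run`, `external_sort`), the `run_size` parameter, and the one-integer-per-line run format are all assumptions made for illustration.

```python
import heapq
import os
import tempfile
from itertools import islice


def write_run(sorted_values):
    """Write one sorted run to a temporary file, one integer per line."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as f:
        for v in sorted_values:
            f.write(f"{v}\n")
    return path


def read_run(path):
    """Stream a run back from storage, one integer at a time."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(values, run_size):
    """Yield the integers in `values` in ascending order.

    At most `run_size` values are sorted in memory at once (hypothetical
    stand-in for the available main memory).
    """
    it = iter(values)
    run_paths = []
    try:
        # Phase 1: run generation. Sort each memory-sized chunk and
        # write it out as a sorted run.
        while True:
            chunk = list(islice(it, run_size))
            if not chunk:
                break
            chunk.sort()
            run_paths.append(write_run(chunk))
        # Phase 2: k-way merge. Every run is read back from storage
        # and merged lazily on the host.
        yield from heapq.merge(*(read_run(p) for p in run_paths))
    finally:
        # Remove the temporary run files once merging is done.
        for p in run_paths:
            os.remove(p)
```

In this sketch every run is written to storage and later read back by the host for merging. According to the abstract, the proposed scheme instead performs the merge on the fly inside the SSD, which is where the saved data transfer comes from.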
Similar resources
External Sorting on Flash Memory Via Natural Page Run Generation
The increasing popularity of flash memory means more database systems will run on flash memory in the future. One of the most important database operations is the external sort. Hence, this paper is focused on studying the problem of efficient external sorting on flash memory. In contrast to most previous work, we target the situation where previously sorted data has become progressively un-sor...
BigSparse: High-performance external graph analytics
We present BigSparse, a fully external graph analytics system that picks up where semi-external systems like FlashGraph and X-Stream, which only store vertex data in memory, left off. BigSparse stores both edge and vertex data in an array of SSDs and avoids random updates to the vertex data, by first logging the vertex updates and then sorting the log to sequentialize accesses to the SSDs. This...
PatTrieSort - External String Sorting based on Patricia Tries
External merge sort belongs to the most efficient and widely used algorithms to sort big data: As much data as fits inside is sorted in main memory and afterwards swapped to external storage as so called initial run. After sorting all the data in this way block-wise, the initial runs are merged in a merging phase in order to retrieve the final sorted run containing the completely sorted origina...
Sorting in Parallel Database Systems
Sorting in database processing is frequently required through the use of Order By and Distinct clauses in SQL. Sorting is also widely known in computer science community at large. Sorting in general covers internal and external sorting. Past published work has extensively focused on external sorting on uni-processors (serial external sorting), and internal sorting on multiprocessors (parallel i...
Parallel database sorting
Sorting in database processing is frequently required through the use of Order By and Distinct clauses in SQL. Sorting is also widely known in computer science community at large. Sorting in general covers internal and external sorting. Past published work has extensively focused on external sorting on uni-processors (serial external sorting), and internal sorting on multi-processors (parallel ...
Publication date: 2014